geotiff: pin validated IP into HTTP source to close DNS-rebind TOCTOU #1853

Merged
brendancol merged 1 commit into main from issue-1846 on May 14, 2026

@brendancol
Contributor

Summary

Closes #1846.

_validate_http_url resolves the hostname via socket.getaddrinfo and rejects any URL that resolves to a private / loopback / link-local IP. urllib3 then resolves the hostname a second time at connect-time. A hostile DNS server can return a public IP to the validator and a private IP at connect (classic DNS rebinding / TOCTOU), defeating the SSRF guard. The existing code comment honestly labels the check "best-effort" for this reason.
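The check-then-use gap described above can be reproduced without any network access. The following is a minimal sketch, not the code from `_reader.py`: `is_public`, `validate_host`, and `RebindingResolver` are hypothetical names, and the resolver is a stand-in for a hostile DNS server that answers differently on each lookup.

```python
import ipaddress
import socket


def is_public(ip: str) -> bool:
    """True when ip is not private, loopback, link-local, or otherwise special."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


def validate_host(host, port, resolver=socket.getaddrinfo):
    """Resolve once; raise if ANY address is non-public, else return the first."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    ips = [info[4][0] for info in infos]
    if not ips:
        raise ValueError(f"{host} did not resolve to any address")
    for ip in ips:
        if not is_public(ip):
            raise ValueError(f"{host} resolves to non-public address {ip}")
    return ips[0]


class RebindingResolver:
    """Hypothetical hostile resolver: public answer first, loopback afterwards."""

    def __init__(self):
        self.calls = 0

    def __call__(self, host, port, *args, **kwargs):
        self.calls += 1
        ip = "93.184.216.34" if self.calls == 1 else "127.0.0.1"
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (ip, port))]


resolver = RebindingResolver()
validated_ip = validate_host("evil.example", 443, resolver)  # the check
connect_ip = resolver("evil.example", 443)[0][4][0]          # the later use
```

Here the validator approves the public address, while a second lookup of the same name yields loopback. Any client that resolves the hostname again at connect time would therefore dial an address the validator never saw.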

Fix

  • _validate_http_url now returns the first validated public IP (or None when the escape hatch is set).
  • _HTTPSource builds a per-hop urllib3 HTTP[S]ConnectionPool whose custom _PinnedHTTP[S]Connection._new_conn dials that exact IP via socket.create_connection, never re-consulting DNS.
  • self.host stays the original hostname so the HTTP Host header (virtual hosting) and TLS SNI / cert verification still use the name, not the IP.
  • server_hostname is passed to HTTPSConnectionPool so HTTPS certs are still validated against the original hostname.
  • Pools are cached per (scheme, host, port, ip) tuple on the source. Range requests against the same hop reuse TCP/TLS; redirect hops to a different host get their own pool.
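The pinning idea in the list above can be illustrated with a standard-library analogue. This sketch assumes `http.client` instead of urllib3, so it is not the PR's `_PinnedHTTP[S]Connection`; `PinnedHTTPConnection` and its `pinned_ip` parameter are hypothetical names. It shows only the plain-HTTP case: the socket is opened to the validated IP while `self.host` keeps the hostname.

```python
import http.client
import socket


class PinnedHTTPConnection(http.client.HTTPConnection):
    """Connect to a pre-validated IP while keeping self.host as the hostname."""

    def __init__(self, host, port=None, *, pinned_ip, **kwargs):
        super().__init__(host, port, **kwargs)
        self.pinned_ip = pinned_ip  # hypothetical attribute holding the pin

    def connect(self):
        # Dial the validated IP literal. Resolving an IP literal does not
        # query DNS, so a rebinding answer cannot change the TCP target.
        # self.host is untouched, so the Host header that http.client
        # generates still carries the original hostname.
        self.sock = socket.create_connection(
            (self.pinned_ip, self.port), self.timeout
        )
```

An HTTPS variant would additionally wrap the socket with TLS and pass the original hostname as `server_hostname`, so that SNI and certificate checks run against the name and not the IP, as the Fix list describes.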

Residual scope

  • Within a single hop the IP is pinned. Across redirects each hop is independently resolved, validated, and pinned (a redirect to a new host gets a fresh pool).
  • An attacker who legitimately controls a hostname with multiple public IPs can influence which one we pick (we take the first). They cannot make us connect to a private IP.
  • XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 returns None from the validator and falls back to the shared PoolManager. Localhost dev/test flows still work, with the trade-off being no pin (same as before).
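The pool selection described above, including the unpinned fallback, might look roughly like the sketch below. `PoolCache`, `pool_for`, and the `make_pool` factory are hypothetical names; the real code builds urllib3 pools, which this sketch replaces with whatever objects the caller supplies.

```python
from urllib.parse import urlsplit


class PoolCache:
    """Cache one pool per (scheme, host, port, ip); fall back when unpinned."""

    def __init__(self, shared_manager, make_pool):
        self._shared = shared_manager  # used when the validator returned None
        self._make_pool = make_pool    # hypothetical factory for a pinned pool
        self._pools = {}

    def pool_for(self, url, pinned_ip):
        if pinned_ip is None:
            # Escape hatch active: no pin, so reuse the shared manager.
            return self._shared
        parts = urlsplit(url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
        key = (parts.scheme, parts.hostname, port, pinned_ip)
        if key not in self._pools:
            self._pools[key] = self._make_pool(*key)
        return self._pools[key]
```

Because the IP is part of the key, a redirect hop that validates to a different address cannot reuse a pool pinned to the previous one.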

Test plan

  • New file test_dns_rebinding_pin_issue_1846.py (9 tests):
    • Validator returns the first public IP (v4 and v6).
    • Validator returns None when the escape hatch is on.
    • Validator still raises if any resolved IP is private.
    • _HTTPSource.__init__ records the pinned IP.
    • End-to-end rebinding: getaddrinfo returns 93.184.216.34 to the validator, then 127.0.0.1 afterwards. The intercepted socket.create_connection confirms the TCP target is 93.184.216.34.
    • Host header and SNI stay set to the original hostname.
    • Redirect to a safe-but-different host re-resolves and re-validates the new hostname.
    • Existing redirect-to-private guard still fires.
  • The existing test_ssrf_hardening_1664.py suite (46 tests) still passes.
  • All HTTP-related tests across the geotiff suite still pass (125 tests).
  • Full geotiff suite: 2407 passed (the 11 failures are pre-existing on main and unrelated -- matplotlib/Python 3.14 deepcopy recursion in test_features.py, GPU predictor tests, and a tile-size validation test).

Refs #1664.

…#1846)

`_validate_http_url` resolves the hostname and rejects private IPs at
construction (and on each redirect). urllib3 then resolves the hostname
a *second* time at connect-time. A hostile DNS server can return a
public IP to the validator and a private IP at connect, slipping past
the SSRF guard (the existing comment honestly labels the check
"best-effort" for this reason).

Fix: `_validate_http_url` now returns the first validated public IP.
`_HTTPSource` builds a per-hop urllib3 `HTTP[S]ConnectionPool` whose
custom `_PinnedHTTP[S]Connection._new_conn` dials that exact IP via
`socket.create_connection` without re-consulting DNS. `self.host` stays
set to the original hostname (so the HTTP Host header and TLS SNI /
cert verification still use the name), and `server_hostname` is passed
to `HTTPSConnectionPool` so cert hostname checks run against the name,
not the IP.

Pools are cached per `(scheme, host, port, ip)` tuple on the source so
range requests against the same hop reuse TCP/TLS. Each redirect
target is freshly re-validated and re-pinned (a hop to a new host
gets its own pool). The escape hatch
`XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1` returns `None` from the
validator and falls back to the shared `PoolManager`, so localhost
dev/test flows stay working.

Residual scope: within a single hop the IP is pinned; across redirects
each hop is independently resolved and validated. An attacker who
legitimately owns multiple public IPs on a hostname can influence
which one we pick (we take the first); they cannot redirect us to a
private IP.

Tests:
- Rebinding scenario: validator sees 93.184.216.34; subsequent
  `getaddrinfo` calls return 127.0.0.1. The intercepted
  `socket.create_connection` confirms the TCP target is the validated
  public IP.
- Host header and SNI stay set to the original hostname.
- Redirect to a different safe host re-resolves and re-validates the
  new hostname.
- Existing redirect-to-private guard still fires.

Closes #1846.
@github-actions (bot) added the "performance" label (PR touches performance-sensitive code) May 14, 2026
@brendancol brendancol requested a review from Copilot May 14, 2026 17:00
Contributor

Copilot AI left a comment

Pull request overview

This PR hardens the GeoTIFF HTTP range reader against DNS-rebinding / TOCTOU by pinning the validated (public) IP address into the actual TCP connection, so urllib3 can’t re-resolve the hostname to a different (potentially private) IP at connect-time.

Changes:

  • Update _validate_http_url() to return the first validated public IP (or None when XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 is set) so callers can pin connections.
  • Add pinned-IP urllib3 HTTP[S]Connection + per-hop HTTP[S]ConnectionPool construction/caching in _HTTPSource.
  • Add a new test module covering validator behavior, TCP target pinning, host/SNI preservation, and redirect re-validation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| xrspatial/geotiff/_reader.py | Implements IP pinning by introducing custom urllib3 connection/pool logic and wiring it into _HTTPSource’s redirect-following request loop. |
| xrspatial/geotiff/tests/test_dns_rebinding_pin_issue_1846.py | Adds targeted regression tests for DNS-rebinding prevention and hop-by-hop re-validation behavior. |
Comments suppressed due to low confidence (1)

xrspatial/geotiff/_reader.py:921

  • When pinned_ip is set, _pool_for_request can return a urllib3.HTTPConnectionPool/HTTPSConnectionPool, but _request always calls pool.request('GET', current_url, ...) with an absolute URL. ConnectionPool.request() expects a request target/path (e.g. /cog.tif?x=1), not a full https://host/... URL; passing an absolute URL typically produces an invalid origin-form request line (or a path like /https://host/...) and can break real HTTP range reads on the pinned path. Consider deriving the request target from urlparse(current_url) (path + optional ?query) when using a per-host pool, while keeping the absolute URL for the PoolManager path.
            pool = self._pool_for_request(current_url, current_pin)
            resp = pool.request(
                'GET', current_url,
                headers=headers,
                timeout=timeout,
                redirect=False,
            )
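The reviewer's suggestion to derive the request target from the URL could be sketched as follows. `request_target` is a hypothetical helper, not part of the PR, and whether the merged code needs it depends on how the installed urllib3 version handles absolute URLs passed to a per-host pool.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the origin-form target (path plus optional ?query) of a URL."""
    parts = urlsplit(url)
    target = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        target += "?" + parts.query
    return target
```

With this helper, the per-host pool would receive `request_target(current_url)`, while the shared `PoolManager` path would keep the absolute URL.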

@brendancol brendancol merged commit 30b5356 into main May 14, 2026
15 of 16 checks passed
@brendancol brendancol deleted the issue-1846 branch May 15, 2026 04:38
Labels

performance: PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

Pin validated IP into HTTP source to close DNS-rebinding TOCTOU

2 participants